| Estimation de population au 1er janvier, par département, sexe et grande classe d'âge | ...2 | ...3 | ...4 | ...5 | ...6 | ...7 | ...8 | ...9 | ...10 | ...11 | ...12 | ...13 | ...14 | ...15 | ...16 | ...17 | ...18 | ...19 | ...20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Année 2023 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| Départements | NA | Ensemble | NA | NA | NA | NA | NA | Hommes | NA | NA | NA | NA | NA | Femmes | NA | NA | NA | NA | NA |
| NA | NA | 0 à 19 ans | 20 à 39 ans | 40 à 59 ans | 60 à 74 ans | 75 ans et plus | Total | 0 à 19 ans | 20 à 39 ans | 40 à 59 ans | 60 à 74 ans | 75 ans et plus | Total | 0 à 19 ans | 20 à 39 ans | 40 à 59 ans | 60 à 74 ans | 75 ans et plus | Total |
| 01 | Ain | 170727 | 149960 | 183127 | 109363 | 58760 | 671937 | 88293 | 75548 | 90943 | 52486 | 24308 | 331578 | 82434 | 74412 | 92184 | 56877 | 34452 | 340359 |
| 02 | Aisne | 125684 | 113454 | 132390 | 98737 | 52526 | 522791 | 64688 | 57214 | 65872 | 47242 | 20699 | 255715 | 60996 | 56240 | 66518 | 51495 | 31827 | 267076 |
What is the impact of socio-economic factors on Criminality in France?
Did you know?
This report is available online
The public repository can also be found here: Git repository
1 Introduction
1.1 Overview and motivations
This project will analyze the factors influencing Criminality in France. We chose this subject because we are both french and criminality consistently ranks as one of the most important priority for the French population. In our study, we wanted to focus on the social and economic factors contributing to this issue. France is often regarded as one of the major world powers, and yet, it is clear that similarly to the rest of the world, there are increasing safety concerns regarding crime. This subject has also made it’s way into everyday discussions as it proves to be a polarizing topic. In the last thirty years, Politicians have often promised to tackle crime through several means, and as the population wants more to be done against it, the government has often made it one of their top priority.
In recent times, there has been a noticeable rise in criminal activities across France. Regions such as “Seine Saint Denis” in the Paris area. has seen emboldened criminals committing crimes in broad daylight, contributing to a perception of increased chaos. Similar concerns have surfaced in cities like Marseille. Additionally, it’s worth noting that the prison population has reached an all-time high in July 2023. But what are the reasons for increased criminality?
The goal of our research is to synthesize the main reasons behind criminality. Medias have often presented various reasons to explain crime, our research will look at those hints and see if they may have any statistical significance. It should be noted that data regarding “Ethnic origin” is not available, this variable is sometimes singled out by some as one of the driver of criminality and unlike in the United States, France has outlawed any form of statistics on this subject. as such, we will mostly focus on socio-economics drivers behind crime. France is facing substantial socio-economics issues, unemployment, poverty, conditions of leaving, and school level. Our project explores the link between socio-economic factors and criminality rates. Establishing such a link would give a new outlook on crime and its causes. Additionally we would like to know whether, this data could be used by the government to address public safety. In this project we will get answers regarding the core human drivers that push humans to commit crimes and how their socio-economic environment or educational path may impact their ability to commit crime.
1.2 Associated papers
https://www.institutpourlajustice.org/content/2017/12/Pauvreté-et-délinquance.pdf
1.3 Research questions
The purpose of this study is to evaluate the impact of different socioeconomic factors on criminality in France. We are using for that a french government database which divide crimes in 14 categories. In our regression analysis, we decided to focus on the five most common types of crime:
Assault and Battery
Burglaries
Intentional damage and destructions
Drug use
Theft without violence
We will consider the following macroeconomic variables:
- Poverty rate,
- Unemployment rate
- School indicators(Success rate at an exam/Share of adults without a dimploma)
- Density of Population
- Results of previous presidential election.
- Immigration
These are the following question we will focus on:
- Can the criminality rate in France be explained using the unemployment rate?
- Can the criminality rate in France be explained using the poverty rate?
- Can the criminality rate in France be explained using education indicators ?
- Can the density of population in territories explain criminality?
- Can the immigration rate explain criminality?
- Can the previous presidential results in territories have an impact on criminality?
- Is there a correlation between these mentioned variables?
2 Data Wrangling
In this section,we will look at our raw datasets, what information they give us, how we wrangle/clean them to arrive at our final dataset. Lastly we will look at the mistakes in our data and any anomalies or outliers we may find.
Firstly, we will use a dataset with values by french state. This dataset will not be used for our Regression analysis as only having 95 observations is not enough to have a good statistical analysis.
2.1 Data by Department
2.1.1 Raw datasets
2.1.1.1 Population
There is one excel sheet per year, starting in 1975 and ending in 2023
This dataset gives us information regarding the population inside the dataset, the population is also divided by gender and age groups. We will only use the total number of people per department. It shoudl be noted that what we call our department number is a Character. This is due to the Corsica departments, split in two in 1975 and named 2A and 2B, they used to department number 20.
Source: Population by state
2.1.1.2 Crime
There is one excel sheet per department.
| Index | libellé index | _2022_08 | _2022_07 | _2022_06 | _2022_05 | _2022_04 | _2022_03 | _2022_02 | _2022_01 | _2021_12 | _2021_11 | _2021_10 | _2021_09 | _2021_08 | _2021_07 | _2021_06 | _2021_05 | _2021_04 | _2021_03 | _2021_02 | _2021_01 | _2020_12 | _2020_11 | _2020_10 | _2020_09 | _2020_08 | _2020_07 | _2020_06 | _2020_05 | _2020_04 | _2020_03 | _2020_02 | _2020_01 | _2019_12 | _2019_11 | _2019_10 | _2019_09 | _2019_08 | _2019_07 | _2019_06 | _2019_05 | _2019_04 | _2019_03 | _2019_02 | _2019_01 | _2018_12 | _2018_11 | _2018_10 | _2018_09 | _2018_08 | _2018_07 | _2018_06 | _2018_05 | _2018_04 | _2018_03 | _2018_02 | _2018_01 | _2017_12 | _2017_11 | _2017_10 | _2017_09 | _2017_08 | _2017_07 | _2017_06 | _2017_05 | _2017_04 | _2017_03 | _2017_02 | _2017_01 | _2016_12 | _2016_11 | _2016_10 | _2016_09 | _2016_08 | _2016_07 | _2016_06 | _2016_05 | _2016_04 | _2016_03 | _2016_02 | _2016_01 | _2015_12 | _2015_11 | _2015_10 | _2015_09 | _2015_08 | _2015_07 | _2015_06 | _2015_05 | _2015_04 | _2015_03 | _2015_02 | _2015_01 | _2014_12 | _2014_11 | _2014_10 | _2014_09 | _2014_08 | _2014_07 | _2014_06 | _2014_05 | _2014_04 | _2014_03 | _2014_02 | _2014_01 | _2013_12 | _2013_11 | _2013_10 | _2013_09 | _2013_08 | _2013_07 | _2013_06 | _2013_05 | _2013_04 | _2013_03 | _2013_02 | _2013_01 | _2012_12 | _2012_11 | _2012_10 | _2012_09 | _2012_08 | _2012_07 | _2012_06 | _2012_05 | _2012_04 | _2012_03 | _2012_02 | _2012_01 | _2011_12 | _2011_11 | _2011_10 | _2011_09 | _2011_08 | _2011_07 | _2011_06 | _2011_05 | _2011_04 | _2011_03 | _2011_02 | _2011_01 | _2010_12 | _2010_11 | _2010_10 | _2010_09 | _2010_08 | _2010_07 | _2010_06 | _2010_05 | _2010_04 | _2010_03 | _2010_02 | _2010_01 | _2009_12 | _2009_11 | _2009_10 | _2009_09 | _2009_08 | _2009_07 | _2009_06 | _2009_05 | _2009_04 | _2009_03 | _2009_02 | _2009_01 | _2008_12 | _2008_11 | _2008_10 | _2008_09 | _2008_08 | _2008_07 | _2008_06 | _2008_05 | _2008_04 | _2008_03 | _2008_02 | _2008_01 | _2007_12 | _2007_11 | _2007_10 | _2007_09 | _2007_08 | _2007_07 | _2007_06 | _2007_05 | _2007_04 | _2007_03 | _2007_02 | _2007_01 | _2006_12 | _2006_11 | _2006_10 | _2006_09 | _2006_08 | _2006_07 | _2006_06 | _2006_05 | _2006_04 | _2006_03 | _2006_02 | _2006_01 | _2005_12 | _2005_11 | _2005_10 | _2005_09 | _2005_08 | _2005_07 | _2005_06 | _2005_05 | _2005_04 | _2005_03 | _2005_02 | _2005_01 | _2004_12 | _2004_11 | _2004_10 | _2004_09 | _2004_08 | _2004_07 | _2004_06 | _2004_05 | _2004_04 | _2004_03 | _2004_02 | _2004_01 | _2003_12 | _2003_11 | _2003_10 | _2003_09 | _2003_08 | _2003_07 | _2003_06 | _2003_05 | _2003_04 | _2003_03 | _2003_02 | _2003_01 | _2002_12 | _2002_11 | _2002_10 | _2002_09 | _2002_08 | _2002_07 | _2002_06 | _2002_05 | _2002_04 | _2002_03 | _2002_02 | _2002_01 | _2001_12 | _2001_11 | _2001_10 | _2001_09 | _2001_08 | _2001_07 | _2001_06 | _2001_05 | _2001_04 | _2001_03 | _2001_02 | _2001_01 | _2000_12 | _2000_11 | _2000_10 | _2000_09 | _2000_08 | _2000_07 | _2000_06 | _2000_05 | _2000_04 | _2000_03 | _2000_02 | _2000_01 | _1999_12 | _1999_11 | _1999_10 | _1999_09 | _1999_08 | _1999_07 | _1999_06 | _1999_05 | _1999_04 | _1999_03 | _1999_02 | _1999_01 | _1998_12 | _1998_11 | _1998_10 | _1998_09 | _1998_08 | _1998_07 | _1998_06 | _1998_05 | _1998_04 | _1998_03 | _1998_02 | _1998_01 | _1997_12 | _1997_11 | _1997_10 | _1997_09 | _1997_08 | _1997_07 | _1997_06 | _1997_05 | _1997_04 | _1997_03 | _1997_02 | _1997_01 | _1996_12 | _1996_11 | _1996_10 | _1996_09 | _1996_08 | _1996_07 | _1996_06 | _1996_05 | _1996_04 | _1996_03 | _1996_02 | _1996_01 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Règlements de compte entre malfaiteurs | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | Homicides pour voler et à l'occasion de vols | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | Homicides pour d'autres motifs | 0 | 3 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 1 | 0 | 0 | 1 | 0 | 2 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 6 | 0 | 2 | 0 | 2 | 0 | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 1 | 0 | 0 | 4 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 2 | 0 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 1 | 2 | 1 | 0 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 1 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 3 | 0 | 1 | 1 | 1 | 0 | 1 | 0 | 3 | 0 | 0 | 0 |
| 4 | Tentatives d'homicides pour voler et à l'occasion de vols | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| 5 | Tentatives homicides pour d'autres motifs | 0 | 4 | 0 | 1 | 1 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 3 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 3 | 0 | 1 | 0 | 3 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 1 | 0 | 2 | 0 | 1 | 0 | 1 | 3 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 2 | 5 | 2 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 3 | 1 | 3 | 4 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 2 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 2 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 2 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 3 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 7 | 0 | 0 | 5 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 3 | 1 | 0 | 1 | 4 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 1 | 1 | 2 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 4 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 2 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 2 | 1 | 0 | 3 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 2 | 2 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 2 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 0 |
| 6 | Coups et blessures volontaires suivis de mort | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 3 | 0 | 1 | 0 | 1 | 0 | 0 |
This dataset is our main one, it tells us for example, that in the department “01” in August 2022, there was 1 case of counterfeit currency. we will be able to calculate the total number of crimes happening each month by department.
Source: Crime by state
2.1.1.3 Unemployement
| Libellé | idBank | Dernière.mise.à.jour | Période | X1982.T1 | X1982.T2 | X1982.T3 | X1982.T4 | X1983.T1 | X1983.T2 | X1983.T3 | X1983.T4 | X1984.T1 | X1984.T2 | X1984.T3 | X1984.T4 | X1985.T1 | X1985.T2 | X1985.T3 | X1985.T4 | X1986.T1 | X1986.T2 | X1986.T3 | X1986.T4 | X1987.T1 | X1987.T2 | X1987.T3 | X1987.T4 | X1988.T1 | X1988.T2 | X1988.T3 | X1988.T4 | X1989.T1 | X1989.T2 | X1989.T3 | X1989.T4 | X1990.T1 | X1990.T2 | X1990.T3 | X1990.T4 | X1991.T1 | X1991.T2 | X1991.T3 | X1991.T4 | X1992.T1 | X1992.T2 | X1992.T3 | X1992.T4 | X1993.T1 | X1993.T2 | X1993.T3 | X1993.T4 | X1994.T1 | X1994.T2 | X1994.T3 | X1994.T4 | X1995.T1 | X1995.T2 | X1995.T3 | X1995.T4 | X1996.T1 | X1996.T2 | X1996.T3 | X1996.T4 | X1997.T1 | X1997.T2 | X1997.T3 | X1997.T4 | X1998.T1 | X1998.T2 | X1998.T3 | X1998.T4 | X1999.T1 | X1999.T2 | X1999.T3 | X1999.T4 | X2000.T1 | X2000.T2 | X2000.T3 | X2000.T4 | X2001.T1 | X2001.T2 | X2001.T3 | X2001.T4 | X2002.T1 | X2002.T2 | X2002.T3 | X2002.T4 | X2003.T1 | X2003.T2 | X2003.T3 | X2003.T4 | X2004.T1 | X2004.T2 | X2004.T3 | X2004.T4 | X2005.T1 | X2005.T2 | X2005.T3 | X2005.T4 | X2006.T1 | X2006.T2 | X2006.T3 | X2006.T4 | X2007.T1 | X2007.T2 | X2007.T3 | X2007.T4 | X2008.T1 | X2008.T2 | X2008.T3 | X2008.T4 | X2009.T1 | X2009.T2 | X2009.T3 | X2009.T4 | X2010.T1 | X2010.T2 | X2010.T3 | X2010.T4 | X2011.T1 | X2011.T2 | X2011.T3 | X2011.T4 | X2012.T1 | X2012.T2 | X2012.T3 | X2012.T4 | X2013.T1 | X2013.T2 | X2013.T3 | X2013.T4 | X2014.T1 | X2014.T2 | X2014.T3 | X2014.T4 | X2015.T1 | X2015.T2 | X2015.T3 | X2015.T4 | X2016.T1 | X2016.T2 | X2016.T3 | X2016.T4 | X2017.T1 | X2017.T2 | X2017.T3 | X2017.T4 | X2018.T1 | X2018.T2 | X2018.T3 | X2018.T4 | X2019.T1 | X2019.T2 | X2019.T3 | X2019.T4 | X2020.T1 | X2020.T2 | X2020.T3 | X2020.T4 | X2021.T1 | X2021.T2 | X2021.T3 | X2021.T4 | X2022.T1 | X2022.T2 | X2022.T3 | X2022.T4 | X2023.T1 | X2023.T2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Taux de chômage localisé par région - France métropolitaine | 1515842 | 29/09/2023 12:00 | NA | 6.7 | 6.8 | 7.0 | 7.0 | 7.0 | 7.0 | 7.2 | 7.5 | 7.9 | 8.3 | 8.5 | 8.7 | 8.8 | 8.8 | 8.8 | 8.8 | 8.7 | 8.8 | 8.8 | 8.9 | 9.0 | 9.0 | 8.8 | 8.8 | 8.7 | 8.6 | 8.5 | 8.3 | 8.1 | 7.9 | 7.8 | 7.8 | 7.7 | 7.7 | 7.6 | 7.6 | 7.6 | 7.7 | 8.0 | 8.2 | 8.4 | 8.6 | 8.8 | 9.0 | 9.2 | 9.5 | 9.8 | 10.1 | 10.3 | 10.4 | 10.3 | 10.1 | 9.8 | 9.6 | 9.5 | 9.6 | 9.9 | 10.1 | 10.2 | 10.2 | 10.3 | 10.3 | 10.3 | 10.1 | 9.9 | 9.8 | 9.7 | 9.8 | 9.8 | 9.8 | 9.5 | 9.1 | 8.7 | 8.3 | 8.0 | 7.7 | 7.4 | 7.3 | 7.3 | 7.4 | 7.5 | 7.5 | 7.5 | 7.6 | 8.0 | 8.1 | 8.0 | 8.4 | 8.6 | 8.4 | 8.5 | 8.5 | 8.3 | 8.5 | 8.7 | 8.7 | 8.8 | 8.6 | 8.6 | 8.0 | 8.2 | 7.8 | 7.7 | 7.1 | 6.9 | 7.0 | 7.1 | 7.4 | 8.2 | 8.9 | 8.9 | 9.1 | 9.0 | 8.9 | 8.9 | 8.8 | 8.8 | 8.7 | 8.9 | 9.0 | 9.1 | 9.4 | 9.4 | 9.8 | 10.0 | 10.1 | 9.9 | 9.8 | 9.8 | 9.8 | 9.9 | 10.1 | 10.0 | 10.2 | 10.0 | 9.9 | 9.9 | 9.7 | 9.6 | 9.8 | 9.3 | 9.2 | 9.2 | 8.7 | 9.0 | 8.8 | 8.6 | 8.4 | 8.5 | 8.1 | 8.1 | 7.9 | 7.7 | 7.0 | 8.8 | 7.9 | 8.0 | 7.7 | 7.7 | 7.2 | 7.1 | 7.1 | 7.1 | 7.0 | 6.9 | 6.9 |
| Codes | NA | NA | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | SD | |
| Taux de chômage localisé par région - Île-de-France | 1515843 | 29/09/2023 12:00 | NA | 5.6 | 5.7 | 5.7 | 5.7 | 5.6 | 5.6 | 5.8 | 6.0 | 6.2 | 6.5 | 6.7 | 6.8 | 6.9 | 7.0 | 7.2 | 7.3 | 7.3 | 7.3 | 7.3 | 7.4 | 7.5 | 7.5 | 7.5 | 7.4 | 7.4 | 7.3 | 7.3 | 7.1 | 6.9 | 6.7 | 6.6 | 6.5 | 6.5 | 6.5 | 6.4 | 6.4 | 6.4 | 6.6 | 6.9 | 7.0 | 7.1 | 7.3 | 7.6 | 7.9 | 8.2 | 8.6 | 8.9 | 9.2 | 9.3 | 9.3 | 9.2 | 9.1 | 8.9 | 8.8 | 8.7 | 8.8 | 9.0 | 9.2 | 9.2 | 9.2 | 9.3 | 9.4 | 9.4 | 9.2 | 9.1 | 9.0 | 8.9 | 8.9 | 8.9 | 8.8 | 8.5 | 8.2 | 7.8 | 7.4 | 7.1 | 6.8 | 6.5 | 6.4 | 6.5 | 6.6 | 6.8 | 7.1 | 7.3 | 7.4 | 7.9 | 8.1 | 8.1 | 8.4 | 8.7 | 8.5 | 8.6 | 8.6 | 8.3 | 8.4 | 8.5 | 8.4 | 8.4 | 8.2 | 8.1 | 7.5 | 7.7 | 7.4 | 7.2 | 6.6 | 6.3 | 6.4 | 6.4 | 6.5 | 7.1 | 7.6 | 7.8 | 8.1 | 8.0 | 7.9 | 7.9 | 7.9 | 7.8 | 7.8 | 7.9 | 8.0 | 8.1 | 8.3 | 8.3 | 8.5 | 8.7 | 8.8 | 8.7 | 8.7 | 8.7 | 8.7 | 8.8 | 9.0 | 8.8 | 9.0 | 8.9 | 8.8 | 8.8 | 8.6 | 8.5 | 8.7 | 8.3 | 8.2 | 8.3 | 7.8 | 8.1 | 8.0 | 7.8 | 7.7 | 7.6 | 7.4 | 7.4 | 7.2 | 7.0 | 6.3 | 8.3 | 7.7 | 7.8 | 7.5 | 7.6 | 7.1 | 7.0 | 7.0 | 6.9 | 6.8 | 6.6 | 6.7 |
| Codes | NA | NA | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | SD | |
| Taux de chômage localisé par région - Centre-Val de Loire | 1515847 | 29/09/2023 12:00 | NA | 5.7 | 5.7 | 5.9 | 6.0 | 6.0 | 6.0 | 6.3 | 6.7 | 7.1 | 7.4 | 7.7 | 8.0 | 8.0 | 8.0 | 8.1 | 8.0 | 8.1 | 8.2 | 8.1 | 8.3 | 8.4 | 8.4 | 8.2 | 8.0 | 7.9 | 7.7 | 7.6 | 7.4 | 7.2 | 7.0 | 6.8 | 6.8 | 6.6 | 6.4 | 6.4 | 6.4 | 6.4 | 6.5 | 6.8 | 7.0 | 7.3 | 7.5 | 7.6 | 7.8 | 8.0 | 8.4 | 8.7 | 9.0 | 9.2 | 9.2 | 9.2 | 8.9 | 8.6 | 8.3 | 8.1 | 8.3 | 8.6 | 8.8 | 9.0 | 9.1 | 9.2 | 9.0 | 8.9 | 8.7 | 8.5 | 8.4 | 8.3 | 8.3 | 8.3 | 8.2 | 7.9 | 7.5 | 7.1 | 6.7 | 6.5 | 6.2 | 6.0 | 5.9 | 5.9 | 6.1 | 6.1 | 6.2 | 6.3 | 6.4 | 6.7 | 7.0 | 7.0 | 7.4 | 7.6 | 7.4 | 7.5 | 7.6 | 7.4 | 7.5 | 7.7 | 7.7 | 7.8 | 7.6 | 7.6 | 7.1 | 7.1 | 6.7 | 6.7 | 6.2 | 5.9 | 6.1 | 6.3 | 6.6 | 7.5 | 8.1 | 8.1 | 8.4 | 8.4 | 8.3 | 8.3 | 8.2 | 8.1 | 8.0 | 8.3 | 8.4 | 8.5 | 8.8 | 9.0 | 9.3 | 9.5 | 9.6 | 9.5 | 9.4 | 9.4 | 9.5 | 9.6 | 9.8 | 9.6 | 9.8 | 9.7 | 9.6 | 9.7 | 9.5 | 9.4 | 9.5 | 9.1 | 9.0 | 8.9 | 8.4 | 8.7 | 8.5 | 8.5 | 8.3 | 8.4 | 8.0 | 8.0 | 7.8 | 7.4 | 6.8 | 8.2 | 7.2 | 7.4 | 7.2 | 7.2 | 6.7 | 6.8 | 6.9 | 6.8 | 6.7 | 6.6 | 6.7 |
| Codes | NA | NA | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | A | SD |
Source: Unemployement by state
2.1.1.4 Middle school final exam
| Session | Numero.d.etablissement | Type.d.etablissement | Patronyme | Secteur.d.enseignement | Commune | Libellé.commune | Code.département | Libellé.département | Code.académie | Libellé.académie | Code.région | Libellé.région | Inscrits | Presents | Admis | Admis.sans.mention | Nombre_d_admis_Mention_AB | Admis.Mention.bien | Admis.Mention.très.bien | Taux.de.réussite |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2008 | 0381810L | COLLEGE | GERARD PHILIPE | PUBLIC | 38169 | FONTAINE | 038 | ISERE | 8 | GRENOBLE | 1 | AUVERGNE-RHONE-ALPES | 97 | 94 | 57 | 20 | 20 | 16 | 1 | 60,60% |
| 2008 | 0381818V | LYCEE PROFESSIONNEL | - | PUBLIC | 38563 | VOIRON | 038 | ISERE | 8 | GRENOBLE | 1 | AUVERGNE-RHONE-ALPES | 19 | 19 | 19 | 13 | 5 | 1 | 0 | 100,00% |
| 2008 | 0381888W | LYCEE PROFESSIONNEL | - | PUBLIC | 38509 | LA TOUR-DU-PIN | 038 | ISERE | 8 | GRENOBLE | 1 | AUVERGNE-RHONE-ALPES | 24 | 24 | 22 | 11 | 8 | 3 | 0 | 91,60% |
| 2008 | 0382099A | LYCEE | ROGER DESCHAUX | PUBLIC | 38474 | SASSENAGE | 038 | ISERE | 8 | GRENOBLE | 1 | AUVERGNE-RHONE-ALPES | 24 | 22 | 11 | 4 | 7 | 0 | 0 | 50,00% |
| 2008 | 0382100B | COLLEGE | GRESIVAUDAN | PUBLIC | 38397 | SAINT-ISMIER | 038 | ISERE | 8 | GRENOBLE | 1 | AUVERGNE-RHONE-ALPES | 193 | 190 | 174 | 38 | 57 | 45 | 34 | 91,50% |
| 2008 | 0382104F | COLLEGE | LOUIS ARAGON | PUBLIC | 38553 | VILLEFONTAINE | 038 | ISERE | 8 | GRENOBLE | 1 | AUVERGNE-RHONE-ALPES | 90 | 86 | 53 | 33 | 9 | 8 | 3 | 61,60% |
At the end of middle school in France, sutdents must complete an exam and if they succed, the students will receive their “brevet des collèges” and can continue to high school. This would be similar to the “Certificat” in Switzerland. Here this database gives us results for each middle school in France. The first column is exam year. Column 2 to 13 tell us where and what type of school it was. After column 13, we find how many students registered, how many came and how many students obtained the “Brevet”. Lastly, we get more information regarding how well the students did. A “mention” can be obtained if the students acheive a certain average grade. Lastly, the success rate is calculated by dividing the number of students who obtained the diploma by the number of students who came the day of the exam. We will not use this sucess rate as we believe that computing the success rate by diving students who obtained the degree with registered students would give us more information regarding education levels in France.
2.1.1.5 Election Results
| Code du département | Libellé du département | Inscrits | Abstentions | % Abs/Ins | Votants | % Vot/Ins | Blancs | % Blancs/Ins | % Blancs/Vot | Nuls | % Nuls/Ins | % Nuls/Vot | Exprimés | % Exp/Ins | % Exp/Vot | Sexe...17 | Nom...18 | Prénom...19 | Voix...20 | % Voix/Ins...21 | % Voix/Exp...22 | Sexe...23 | Nom...24 | Prénom...25 | Voix...26 | % Voix/Ins...27 | % Voix/Exp...28 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Ain | 415950 | 93130 | 22.39 | 322820 | 77.61 | 28852 | 6.94 | 8.94 | 8738 | 2.10 | 2.71 | 285230 | 68.57 | 88.36 | M | MACRON | Emmanuel | 173809 | 41.79 | 60.94 | F | LE PEN | Marine | 111421 | 26.79 | 39.06 |
| 2 | Aisne | 375791 | 90745 | 24.15 | 285046 | 75.85 | 22838 | 6.08 | 8.01 | 9067 | 2.41 | 3.18 | 253141 | 67.36 | 88.81 | M | MACRON | Emmanuel | 119202 | 31.72 | 47.09 | F | LE PEN | Marine | 133939 | 35.64 | 52.91 |
| 3 | Allier | 253479 | 59294 | 23.39 | 194185 | 76.61 | 18877 | 7.45 | 9.72 | 8522 | 3.36 | 4.39 | 166786 | 65.80 | 85.89 | M | MACRON | Emmanuel | 106579 | 42.05 | 63.90 | F | LE PEN | Marine | 60207 | 23.75 | 36.10 |
| 4 | Alpes-de-Haute-Provence | 126459 | 29255 | 23.13 | 97204 | 76.87 | 9671 | 7.65 | 9.95 | 3722 | 2.94 | 3.83 | 83811 | 66.28 | 86.22 | M | MACRON | Emmanuel | 48994 | 38.74 | 58.46 | F | LE PEN | Marine | 34817 | 27.53 | 41.54 |
| 5 | Hautes-Alpes | 109892 | 24895 | 22.65 | 84997 | 77.35 | 8670 | 7.89 | 10.20 | 2699 | 2.46 | 3.18 | 73628 | 67.00 | 86.62 | M | MACRON | Emmanuel | 47211 | 42.96 | 64.12 | F | LE PEN | Marine | 26417 | 24.04 | 35.88 |
| 6 | Alpes-Maritimes | 761780 | 198631 | 26.07 | 563149 | 73.93 | 47784 | 6.27 | 8.49 | 12414 | 1.63 | 2.20 | 502951 | 66.02 | 89.31 | M | MACRON | Emmanuel | 278407 | 36.55 | 55.35 | F | LE PEN | Marine | 224544 | 29.48 | 44.65 |
We take this sheet for the second round of election results in 2017. This second round opposed Emmanuel Macron to Marine Lepen. In this, dataset, we have the number of citizens, how many voted, blank ballots, invalid ballots and ballots for each candidate. We also have some basic information on the candidate, such as name and gender. Lastly we also have various percentages, with one being of great interest, the percentage of votes that Marine Le Pen received. We decided to take the 2017 election because with our year being 2019, seeing the “impact” of the last election on crime could yield interesting results.
Source: 2017 Election results by state
2.1.1.6 Immigration
| ...1 | Immigrés | Non immigrés | Ensemble...4 | Etrangers | Français | Ensemble...7 |
|---|---|---|---|---|---|---|
| Ain | 78968 | 573464 | 652432 | 592775 | 59657 | 652432 |
| Aisne | 25414 | 505931 | 531345 | 512545 | 18800 | 531345 |
| Allier | 17423 | 318552 | 335975 | 324064 | 11911 | 335975 |
| Alpes-de-Haute-Provence | 13309 | 150999 | 164308 | 155227 | 9081 | 164308 |
| Hautes-Alpes | 8192 | 133028 | 141220 | 135854 | 5366 | 141220 |
| Alpes-Maritimes | 162232 | 932051 | 1094283 | 972043 | 122240 | 1094283 |
Here we take this sheet and look at the number of immigrants by state. According to the documentation, an immigrant is a person born outside of France and living in France, while a foreigner is someone who does not have the french nationality. It should be noted that the label for foreigners and French citizens appears to have been switched in the dataframe. Fortunately, we will only use the number of immigrants for our study.
2.1.1.7 Population Density
| Département | ...2 | Densité |
|---|---|---|
| 01 | Ain | 113 |
| 02 | Aisne | 72 |
| 03 | Allier | 46 |
| 04 | Alpes-de-Haute-Provence | 24 |
| 05 | Hautes-Alpes | 25 |
| 06 | Alpes-Maritimes | 255 |
Source: Population Density by state
2.1.2 Data cleaning
2.1.2.1 List of departements.
We create a list of states with their numbers using the population dataset.
Code
List_depart <- List_depart[5:100, 1:2]
colnames(List_depart) <- c("Dep_number", "Dep_name")To do this, we select our states and only the name and state code from our Population dataset. For reasons stated above, we only select metropolitan states. This means only selecting the first 96 states. We then rename our columns with easy to remember names that will be used as our key in the future.
2.1.2.2 Population by year and department
We will create 2 datasets.
Code
years_pop <- as.character(1975:2023) #every year we need to extract a sheet
Pop_by_year <- List_depart[1] #list with depart numbers
Col_name <- paste0("pop_", years_pop) #List to rename the columns
for (i in 1:length(years_pop)) {
pop <-
read_excel(here::here("Raw_data/population par département.xls"),
sheet = years_pop[i])
Pop_by_year <- cbind(Pop_by_year, pop[5:100, 8])#add them to the df
colnames(Pop_by_year)[i+1] <- Col_name[i] #rename cols
}our first dataset is the population by years. To create this, we create a vector of all the years present in the excel file. using a for loop, we open each excel sheet for each year, add them to our list of departments dataset created previously and rename it to a standard name. This dataset will be useful for our graphical representations.
we then, we take this dataset and only keep the population in 2019 and the department number. We will use this for our clustering analysis.
Code
Pop_2019 <- Pop_by_year |>
select(Dep_number, pop_2019)
View(Pop_2019)2.1.2.3 Crime
Code
years_crime <- 1996:2021
Crime_total <- t(data.frame(years_crime))
colnames(Crime_total) <- Crime_total[1,]
view(Crime_total)
for (i in 1:nrow(List_depart)) {
crimes_depart <- read_excel(here::here("Raw_data/crimes_depart.xlsx"),
sheet = List_depart$Dep_number[i])
#Load each datasheet with the department list
crimes_depart <- crimes_depart |>
select(-(1:10)) |>
summarise_all(sum) |>
pivot_longer(cols = everything(), names_to = "Year", values_to = "Dep_number") |>
mutate(Year = substr(Year, 2, 5)) |>
group_by(Year) |>
summarize(Dep_number = sum(Dep_number)) |> t()
Crime_total <- rbind(Crime_total, crimes_depart[2,])
}
Crime_total <- Crime_total[-1, ]
Crime_total <- cbind(List_depart[1], Crime_total)We create a total number of crimes committed by department and by year.
Code
Crime_2019 <- Crime |>
select(Dep_number, `2019`) |>
rename(Crime_tot_2019 = `2019`)
Crime_2019 <- full_join(Crime_2019, Pop_2019, join_by("Dep_number"))
View(Crime_2019)
Crime_2019 <- Crime_2019 |>
mutate(Crime_rate_1k = (Crime_tot_2019/pop_2019)*1000)We then only select the year 2019 and compute the crime rate per thousand people. To do this we must add pur previously created population in 2019 dataframe.
2.1.2.4 Unemployment
This dataset is a bit trickier.
Code
dummy <- seq_len(nrow(Unemployement)) %% 2 #Dummy variable equal to 1 if the row is an odd number
Unemployement <- cbind(dummy, Unemployement)
Unemployement <- Unemployement |>
filter(dummy == 1) |>
select(-c(1, 3,4,5))
Unemployement <- Unemployement[-(1:15),] # only select departments
Unemployement <- Unemployement |>
separate(Libellé, into = c("X", "Departement") ,sep=" - ") |>
select(-1)
Unemployement <- Unemployement[-(97:100), ]First we need to remove every odd number row as they do not provide us with any information. We create a dummy variable that equals 1 if the row is an odd number, we then only select our rows with data. We then only keep the department name from the department name. We then only select metropolitan departments
Code
Unemployment_quarterly <- Unemployement |>
pivot_longer(cols = -Dep_name, names_to = "Year", values_to = "Total") |>
mutate(Total = as.numeric(Total)) |>
mutate(Year = substr(Year, 2, 8)) |>
group_by(Dep_name, Year) |>
summarize(Total= mean(Total)) |>
pivot_wider(names_from = Year, values_from = Total)
Unemployment_quarterly <- full_join(List_depart, Unemployment_quarterly, join_by(Dep_name))We now need to compute the quarterly unemployment rate. To do this we use pivot_longer to be able to summarize our data by year and department. we then compute a mean unemployment rate for each year and rearrange our departments names to be inline with our list of departments using the full_join() function.
Code
Unemployment_year <- Unemployment_quarterly |>
pivot_longer(cols = -(Dep_number:Dep_name), names_to = "Year", values_to = "Total") |>
mutate(Year = substr(Year, 1, 4)) |>
group_by(Dep_number, Dep_name, Year) |>
summarize(Total = mean(Total)) |>
pivot_wider(names_from = Year, values_from = Total)We now compute the Unemployment by year by computing the mean of each quarter.
Code
Unemp <- Unemployment_year |>
select(Departement, `2019`) |>
rename(Dep_name = Departement,
Unemp_2019 = `2019`)And we select only the year 2019 for our clustering and PCA analysis.
2.1.2.5 Middle school results
Code
Middle <- Middle_results |>
filter(Session == 2019) |>
select(Code.département, Admis, Inscrits) |>
mutate(Pass_rate = (Admis/Inscrits)) |>
group_by(Code.département) |>
summarize(Pass_rate_2019 = mean(Pass_rate)) |>
rename(Dep_number = Code.département) |>
mutate(Dep_number = substr(Dep_number, 2, 3))
Middle <- Middle[(2:97), ]Here we start by only taking the year 2019, we then only select useful columns, we then compute the Pass_rate which is the percentage of students who obtained the certificate. We then change the department number to be inline with our naming convention. Lastly, we select metropolitan departments.
2.1.2.6 Election results
Code
Election_by_dep <- Election_dep |>
select(c(1,28)) |>
rename(Dep_number = `Code du département`,
Lepen_score = `% Voix/Exp...28`) |>
mutate(Win_lepen = ifelse(Lepen_score>50, 1, 0))
Election_by_dep <- Election_by_dep[1:96, ]
#rename dep_number
for (i in 1:nrow(Election_by_dep)) {
if(nchar(Election_by_dep$Dep_number[i]) == 1) {
Election_by_dep$Dep_number[i] <- paste0("0", Election_by_dep$Dep_number[i])
}
}For our Election results, we select the columns we need. Then, we select metropolitan departments. Then we need to change our department number to our naming convention to be able to use it as a key. As a reminder, our naming convention is to write single digit department with a leading zero.
2.1.2.7 Immigration
Code
Immig_2019 <- Immigration_2019 |>
slice(1:96) |>
select(c(1, 2, 4)) |>
rename(Dep_name = `...1`,
Immig_tot = Immigrés,
pop_2019 = `Ensemble...4`) |>
mutate(Immig_rate = Immig_tot/pop_2019) |>
select(Dep_name, Immig_rate)Immigration is quite simple, we select only metropolitan departments with the slice() function. We then select columns that we need, rename them in english and compute the immigration rate which is Total number of immigrants divided by the total population.
2.1.2.8 Population density
Code
Dens_2019 <- population_2019 |>
rename(Dep_number = Département,
Density_2019 = Densité) |>
select(Dep_number, Density_2019) |>
slice(1:96)Similarly, population density is quite easy. we rename our selected columns to our naming convention, we then select only metropolitan departments using the slice() function.
2.1.2.9 Join them into a single dataset
Our last step is to join all of our previously cleaned datasets into a single dataset
Code
Full_data_dep <- full_join(List_depart, Unemp, join_by("Dep_name")) |>
full_join(Crime_2019, join_by("Dep_number")) |>
full_join(Middle, join_by("Dep_number")) |>
full_join(Election_by_dep, join_by("Dep_number")) |>
full_join(Immig_2019, join_by("Dep_name")) |>
full_join(Dens_2019, join_by("Dep_number"))2.2 Data by town
2.2.1 Raw datasets
2.2.1.1 Crime
| CODGEO_2023 | annee | classe | unité.de.compte | valeur.publiée | faits | tauxpourmille | complementinfoval | complementinfotaux | POP | millPOP | LOG | millLOG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01001 | 16 | Autres coups et blessures volontaires | victime | ndiff | NA | NA | 1,4139534883720930 | 0,8994774745986377 | 767 | 16 | 3,48000000000000e+02 | 16 |
| 01001 | 17 | Autres coups et blessures volontaires | victime | ndiff | NA | NA | 1,4744186046511627 | 0,9308908948886161 | 776 | 17 | 3,53730434782610e+02 | 17 |
| 01001 | 18 | Autres coups et blessures volontaires | victime | ndiff | NA | NA | 1,6744186046511629 | 1,0494250025506857 | 771 | 18 | 3,53730434782608e+02 | 18 |
| 01001 | 19 | Autres coups et blessures volontaires | victime | diff | 0 | 0,00000000000000e+00 | NA | NA | 779 | 19 | 3,59574601708852e+02 | 19 |
| 01001 | 20 | Autres coups et blessures volontaires | victime | ndiff | NA | NA | 1,4870689655172413 | 0,9798018812196120 | 806 | 20 | 3,66185064822065e+02 | 20 |
| 01001 | 21 | Autres coups et blessures volontaires | victime | ndiff | NA | NA | 1,7098214285714286 | 1,1468851437948424 | 806 | 20 | 3,66185064822065e+02 | 20 |
| 01001 | 22 | Autres coups et blessures volontaires | victime | ndiff | NA | NA | 1,8088888888888890 | 1,2238135473453471 | 806 | 20 | 3,66185064822065e+02 | 20 |
| 01004 | 16 | Autres coups et blessures volontaires | victime | diff | 39 | 2,76968965272353e+00 | NA | NA | 14081 | 16 | 7,12611602800044e+03 | 16 |
| 01004 | 17 | Autres coups et blessures volontaires | victime | diff | 38 | 2,70751692198076e+00 | NA | NA | 14035 | 17 | 7,21711239192120e+03 | 17 |
| 01004 | 18 | Autres coups et blessures volontaires | victime | diff | 46 | 3,23852435933540e+00 | NA | NA | 14204 | 18 | 7,34158489570647e+03 | 18 |
This dataset is quite big with almost 3 million rows. The first column is the CODGEO, this is a towncode by the INSEE(French institute of statistics and economics studies) the first 2 numbers are the department that the town is located in and the last 3 numbers are the town number. This first column will be very useful because it is a unique number for each town and we will use it as our key. The second column is our year, the third is the type of crime commited. The fourth column “valeur.publiée” tells us if the town publishes their crime statistics or not. It should be noted that some town are coded as publishing although the value is missing. We will dive into this later in the EDA part of our report. We then have a column telling us the total number of crimes and one for crime rate for a thousand people. The remaining columns will not be used in our analysis. The remaining LOG variables are for households in the town and the mill is a “millésime” method of census that allows a census to be conducted every year at a smaller scale, this is useful to track changes in the population year by year, but it will not be used by us.
Lastly, the rules for publishing data are the following: crime data is only published if the town records more than 5 instances of crime for 3 consecutive years. This is done to protect the privacy of the people involved and because the statistical analysis of such small data would not yield satisfying results.
Source: Crime by town
2.2.1.2 Density
| X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | X11 | X12 | X13 | X14 | X15 | X16 | X17 | X18 | X19 | X20 | X21 | X22 | X23 | X24 | X25 | X26 | X27 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 01 | ozan | OZAN | ozan | Ozan | O250 | OSN | 01190 | 284 | 01284 | 2 | 26 | 6 | 618 | 469 | 500 | 93 | 6.60 | 4.917 | 46.38 | 2866 | 51546 | +45456 | 462330 | 170 | 205 |
| 2 | 01 | cormoranche-sur-saone | CORMORANCHE-SUR-SAONE | cormoranche sur saone | Cormoranche-sur-Saône | C65652625 | KRMRNXSRSN | 01290 | 123 | 01123 | 2 | 27 | 6 | 1058 | 903 | 1000 | 107 | 9.85 | 4.833 | 46.23 | 2772 | 51379 | +44953 | 461427 | 168 | 211 |
| 3 | 01 | plagne-01 | PLAGNE | plagne | Plagne | P425 | PLKN | 01130 | 298 | 01298 | 4 | 03 | 6 | 129 | 83 | 100 | 20 | 6.20 | 5.733 | 46.18 | 3769 | 51324 | +54342 | 461131 | 560 | 922 |
| 4 | 01 | tossiat | TOSSIAT | tossiat | Tossiat | T230 | TST | 01250 | 422 | 01422 | 2 | 25 | 6 | 1406 | 1111 | 1400 | 138 | 10.17 | 5.317 | 46.13 | 3309 | 51268 | +51854 | 460828 | 244 | 501 |
| 5 | 01 | pouillat | POUILLAT | pouillat | Pouillat | P430 | PLT | 01250 | 309 | 01309 | 2 | 33 | 6 | 88 | 58 | 100 | 14 | 6.23 | 5.433 | 46.33 | 3435 | 51475 | +52542 | 461938 | 333 | 770 |
| 6 | 01 | torcieu | TORCIEU | torcieu | Torcieu | T620 | TRS | 01230 | 421 | 01421 | 1 | 28 | 6 | 698 | 643 | 700 | 65 | 10.72 | 5.400 | 45.92 | 3398 | 51025 | +52343 | 455521 | 257 | 782 |
This dataframe contains various information regarding french towns. Here, can be found the size of each town, the population but also the altitude for example. Here we will use the population in 2010 and the size to compute the density.
Source: Density by town
2.2.1.3 Population by town
| NIVGEO | CODGEO | LIBGEO | SEXE | AGEPYR10 | NB |
|---|---|---|---|---|---|
| COM | 01001 | L'Abergement-Clémenciat | 1 | 00 | 14.813 |
| COM | 01001 | L'Abergement-Clémenciat | 2 | 00 | 13.905 |
| COM | 01001 | L'Abergement-Clémenciat | 1 | 03 | 10.848 |
| COM | 01001 | L'Abergement-Clémenciat | 2 | 03 | 6.856 |
| COM | 01001 | L'Abergement-Clémenciat | 1 | 06 | 23.656 |
| COM | 01001 | L'Abergement-Clémenciat | 2 | 06 | 32.462 |
This dataset is quite simple, we have a column indicating the level, here is is towns, the CODGEO mentioned above, the town name, followed by the gender and age group. finally we have the number of inhabitants that fit these criterias.
Source: Density by town
2.2.1.4 Population by diploma
| Région en géographie courante | Département en géographie courante | Commune en géographie courante | Indicateur de stabilité de la commune | Département en géographie 2021 | Libellé de commune | Aucun diplôme Hommes 16 à 24 ans RP2019 | Aucun diplôme Hommes 25 ans ou plus RP2019 | Aucun diplôme Femmes 16 à 24 ans RP2019 | Aucun diplôme Femmes 25 ans ou plus RP2019 | Diplôme de niveau CEP Hommes 16 à 24 ans RP2019 | Diplôme de niveau CEP Hommes 25 ans ou plus RP2019 | Diplôme de niveau CEP Femmes 16 à 24 ans RP2019 | Diplôme de niveau CEP Femmes 25 ans ou plus RP2019 | Diplôme de niveau BEPC Hommes 16 à 24 ans RP2019 | Diplôme de niveau BEPC Hommes 25 ans ou plus RP2019 | Diplôme de niveau BEPC Femmes 16 à 24 ans RP2019 | Diplôme de niveau BEPC Femmes 25 ans ou plus RP2019 | Diplôme de niveau CAP-BEP Hommes 16 à 24 ans RP2019 | Diplôme de niveau CAP-BEP Hommes 25 ans ou plus RP2019 | Diplôme de niveau CAP-BEP Femmes 16 à 24 ans RP2019 | Diplôme de niveau CAP-BEP Femmes 25 ans ou plus RP2019 | Diplôme de niveau bac général ou technique Hommes 16 à 24 ans RP2019 | Diplôme de niveau bac général ou technique Hommes 25 ans ou plus RP2019 | Diplôme de niveau bac général ou technique Femmes 16 à 24 ans RP2019 | Diplôme de niveau bac général ou technique Femmes 25 ans ou plus RP2019 | Diplôme universitaire de 1er cycle Hommes 16 à 24 ans RP2019 | Diplôme universitaire de 1er cycle Hommes 25 ans ou plus RP2019 | Diplôme universitaire de 1er cycle Femmes 16 à 24 ans RP2019 | Diplôme universitaire de 1er cycle Femmes 25 ans ou plus RP2019 | Diplôme universitaire de 2ème ou 3ème cycle Hommes 16 à 24 ans RP2019 | Diplôme universitaire de 2ème ou 3ème cycle Hommes 25 ans ou plus RP2019 | Diplôme universitaire de 2ème ou 3ème cycle Femmes 16 à 24 ans RP2019 | Diplôme universitaire de 2ème ou 3ème cycle Femmes 25 ans ou plus RP2019 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RR | DR | CR | STABLE | DR21 | LIBELLE | dpx_rec0s1age2_rec1rpop2019 | dpx_rec0s1age2_rec2rpop2019 | dpx_rec0s2age2_rec1rpop2019 | dpx_rec0s2age2_rec2rpop2019 | dpx_rec1s1age2_rec1rpop2019 | dpx_rec1s1age2_rec2rpop2019 | dpx_rec1s2age2_rec1rpop2019 | dpx_rec1s2age2_rec2rpop2019 | dpx_rec2s1age2_rec1rpop2019 | dpx_rec2s1age2_rec2rpop2019 | dpx_rec2s2age2_rec1rpop2019 | dpx_rec2s2age2_rec2rpop2019 | dpx_rec3s1age2_rec1rpop2019 | dpx_rec3s1age2_rec2rpop2019 | dpx_rec3s2age2_rec1rpop2019 | dpx_rec3s2age2_rec2rpop2019 | dpx_rec4s1age2_rec1rpop2019 | dpx_rec4s1age2_rec2rpop2019 | dpx_rec4s2age2_rec1rpop2019 | dpx_rec4s2age2_rec2rpop2019 | dpx_rec5s1age2_rec1rpop2019 | dpx_rec5s1age2_rec2rpop2019 | dpx_rec5s2age2_rec1rpop2019 | dpx_rec5s2age2_rec2rpop2019 | dpx_rec6s1age2_rec1rpop2019 | dpx_rec6s1age2_rec2rpop2019 | dpx_rec6s2age2_rec1rpop2019 | dpx_rec6s2age2_rec2rpop2019 |
| 84 | 01 | 001 | 1 | 01 | Abergement-Clémenciat | 0 | 31.273727679978059 | 0 | 26.319004906244121 | 0 | 36.504516945208763 | 0 | 42.168740366590413 | 4.8204307590319528 | 25.195300294679452 | 0 | 25.897214253562552 | 0 | 72.042804073821756 | 10.047478793216261 | 66.528046044671314 | 9.9094455474678735 | 51.11132775508888 | 4.9547227737339368 | 50.935251110259628 | 10.185512038964646 | 25.624888993112183 | 0 | 35.403682355195734 | 0 | 45.715893991511749 | 0 | 30.43378047534657 |
| 84 | 01 | 002 | 1 | 01 | Abergement-De-Varey | 0 | 0 | 0 | 5.5320444734237153 | 0 | 5.1374280962262988 | 0 | 10.402372543219041 | 0 | 5.2011862716095205 | 0 | 0 | 0 | 31.540359468699528 | 0 | 10.466921990976699 | 0 | 31.273259442268557 | 0 | 26.136622618416695 | 0 | 15.73425007519765 | 0 | 10.596821978971352 | 0 | 10.532272531213694 | 0 | 21.129094510185048 |
| 84 | 01 | 003 | 0 | 01 | Amareins | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| 84 | 01 | 004 | 1 | 01 | Ambérieu-En-Bugey | 76.83122693441176 | 643.20453199840733 | 35.464379173014933 | 860.73512425945796 | 0 | 209.96947893832353 | 0 | 432.8132259662828 | 42.914918277527164 | 207.23820326380567 | 52.610753884473795 | 355.55757783628127 | 181.41495713237916 | 1382.3332572324048 | 122.86736219321433 | 1245.913072797086 | 156.88926046556423 | 822.49605569888286 | 188.34449242818874 | 832.16467035277583 | 50.77482939272798 | 638.97964337399492 | 50.046583285767468 | 536.94746995634785 | 39.280654212696781 | 593.41118637421562 | 47.035995912555691 | 837.01126571351983 |
| 84 | 01 | 005 | 1 | 01 | Ambérieux-En-Dombes | 0 | 83.521324354657963 | 0 | 44.217171717171972 | 0 | 19.652076318743237 | 0 | 63.869248035915504 | 9.8260381593715458 | 39.304152637485977 | 0 | 49.130190796858017 | 4.913019079685748 | 240.73793490460261 | 4.913019079685748 | 167.04264870931641 | 4.9130190796857969 | 88.434343434343873 | 9.8260381593715458 | 152.30359147025854 | 0 | 54.043209876543123 | 4.9130190796856503 | 93.347362514029356 | 0 | 93.347362514029072 | 4.913019079685748 | 83.521324354657537 |
In this dataset, column 1 to 6 are the location of the town, with indicators such as the region, the departement and the town number. Then the population is divided by gender and by diploma obtained, we will only focus on the total population over 25 who doesn’t have any diploma. The whole excel files gives us this information for every year starting in 1968. We only take the year 2019.
Source: Population by diploma by town
2.2.1.5 2017 Election results
| Code du département | Libellé du département | Code de la commune | Libellé de la commune | Inscrits | Abstentions | % Abs/Ins | Votants | % Vot/Ins | Blancs | % Blancs/Ins | % Blancs/Vot | Nuls | % Nuls/Ins | % Nuls/Vot | Exprimés | % Exp/Ins | % Exp/Vot | N°Panneau...19 | Sexe...20 | Nom...21 | Prénom...22 | Voix...23 | % Voix/Ins...24 | % Voix/Exp...25 | N°Panneau...26 | Sexe...27 | Nom...28 | Prénom...29 | Voix...30 | % Voix/Ins...31 | % Voix/Exp...32 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Ain | 1 | L'Abergement-Clémenciat | 598 | 100 | 16.72 | 498 | 83.28 | 37 | 6.19 | 7.43 | 8 | 1.34 | 1.61 | 453 | 75.75 | 90.96 | 1 | M | MACRON | Emmanuel | 272 | 45.48 | 60.04 | 2 | F | LE PEN | Marine | 181 | 30.27 | 39.96 |
| 1 | Ain | 2 | L'Abergement-de-Varey | 209 | 32 | 15.31 | 177 | 84.69 | 21 | 10.05 | 11.86 | 6 | 2.87 | 3.39 | 150 | 71.77 | 84.75 | 1 | M | MACRON | Emmanuel | 93 | 44.50 | 62.00 | 2 | F | LE PEN | Marine | 57 | 27.27 | 38.00 |
| 1 | Ain | 4 | Ambérieu-en-Bugey | 8586 | 2312 | 26.93 | 6274 | 73.07 | 601 | 7.00 | 9.58 | 217 | 2.53 | 3.46 | 5456 | 63.55 | 86.96 | 1 | M | MACRON | Emmanuel | 3213 | 37.42 | 58.89 | 2 | F | LE PEN | Marine | 2243 | 26.12 | 41.11 |
| 1 | Ain | 5 | Ambérieux-en-Dombes | 1172 | 259 | 22.10 | 913 | 77.90 | 77 | 6.57 | 8.43 | 24 | 2.05 | 2.63 | 812 | 69.28 | 88.94 | 1 | M | MACRON | Emmanuel | 423 | 36.09 | 52.09 | 2 | F | LE PEN | Marine | 389 | 33.19 | 47.91 |
| 1 | Ain | 6 | Ambléon | 99 | 23 | 23.23 | 76 | 76.77 | 17 | 17.17 | 22.37 | 0 | 0.00 | 0.00 | 59 | 59.60 | 77.63 | 1 | M | MACRON | Emmanuel | 30 | 30.30 | 50.85 | 2 | F | LE PEN | Marine | 29 | 29.29 | 49.15 |
| 1 | Ain | 7 | Ambronay | 1880 | 336 | 17.87 | 1544 | 82.13 | 138 | 7.34 | 8.94 | 31 | 1.65 | 2.01 | 1375 | 73.14 | 89.05 | 1 | M | MACRON | Emmanuel | 774 | 41.17 | 56.29 | 2 | F | LE PEN | Marine | 601 | 31.97 | 43.71 |
This dataset is very similar to the one for departments.
Source: 2017 Election results by town
2.2.1.6 Poverty by town
| CODGEO | TP4019 | TP5019 | TP6019 | TP60Q219 | TP60IP19 | AGE1TP4019 | AGE1TP5019 | AGE1TP6019 | AGE1TP60Q219 | AGE1TP60IP19 | AGE2TP4019 | AGE2TP5019 | AGE2TP6019 | AGE2TP60Q219 | AGE2TP60IP19 | AGE3TP4019 | AGE3TP5019 | AGE3TP6019 | AGE3TP60Q219 | AGE3TP60IP19 | AGE4TP4019 | AGE4TP5019 | AGE4TP6019 | AGE4TP60Q219 | AGE4TP60IP19 | AGE5TP4019 | AGE5TP5019 | AGE5TP6019 | AGE5TP60Q219 | AGE5TP60IP19 | AGE6TP4019 | AGE6TP5019 | AGE6TP6019 | AGE6TP60Q219 | AGE6TP60IP19 | TME1TP4019 | TME1TP5019 | TME1TP6019 | TME1TP60Q219 | TME1TP60IP19 | TME2TP4019 | TME2TP5019 | TME2TP6019 | TME2TP60Q219 | TME2TP60IP19 | TME3TP4019 | TME3TP5019 | TME3TP6019 | TME3TP60Q219 | TME3TP60IP19 | TME4TP4019 | TME4TP5019 | TME4TP6019 | TME4TP60Q219 | TME4TP60IP19 | TME5TP4019 | TME5TP5019 | TME5TP6019 | TME5TP60Q219 | TME5TP60IP19 | TOL1TP4019 | TOL1TP5019 | TOL1TP6019 | TOL1TP60Q219 | TOL1TP60IP19 | TOL2TP4019 | TOL2TP5019 | TOL2TP6019 | TOL2TP60Q219 | TOL2TP60IP19 | TLD2TP4019 | TLD2TP5019 | TLD2TP6019 | TLD2TP60Q219 | TLD2TP60IP19 | TLD3TP4019 | TLD3TP5019 | TLD3TP6019 | TLD3TP60Q219 | TLD3TP60IP19 | TYM1TP4019 | TYM1TP5019 | TYM1TP6019 | TYM1TP60Q219 | TYM1TP60IP19 | TYM2TP4019 | TYM2TP5019 | TYM2TP6019 | TYM2TP60Q219 | TYM2TP60IP19 | TYM3TP4019 | TYM3TP5019 | TYM3TP6019 | TYM3TP60Q219 | TYM3TP60IP19 | TYM4TP4019 | TYM4TP5019 | TYM4TP6019 | TYM4TP60Q219 | TYM4TP60IP19 | TYM5TP4019 | TYM5TP5019 | TYM5TP6019 | TYM5TP60Q219 | TYM5TP60IP19 | TYM6TP4019 | TYM6TP5019 | TYM6TP6019 | TYM6TP60Q219 | TYM6TP60IP19 | OPR1TP4019 | OPR1TP5019 | OPR1TP6019 | OPR1TP60Q219 | OPR1TP60IP19 | OPR2TP4019 | OPR2TP5019 | OPR2TP6019 | OPR2TP60Q219 | OPR2TP60IP19 | OPR3TP4019 | OPR3TP5019 | OPR3TP6019 | OPR3TP60Q219 | OPR3TP60IP19 | OPR4TP4019 | OPR4TP5019 | OPR4TP6019 | OPR4TP60Q219 | OPR4TP60IP19 | OPR5TP4019 | OPR5TP5019 | OPR5TP6019 | OPR5TP60Q219 | OPR5TP60IP19 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01004 | 16 | 22 | 28 | 7980 | 0.385 | 17 | 23 | 30 | 7950 | 0.388 | 23 | 31 | 37 | 7500 | 0.423 | 19 | 28 | 35 | 8060 | 0.379 | 14 | 19 | 23 | 7180 | 0.447 | NA | 10 | 15 | 9330 | 0.281 | NA | NA | NA | NA | NA | 12 | 17 | 22 | 7750 | 0.404 | 9 | 12 | 16 | 7930 | 0.389 | 13 | 20 | 26 | 8430 | 0.351 | 14 | 21 | 27 | 8420 | 0.352 | 34 | 45 | 53 | 7560 | 0.418 | 5 | 6 | 9 | 9460 | 0.271 | 26 | 35 | 43 | 7520 | 0.421 | 37 | 49 | 58 | 7170 | 0.448 | 14 | 19 | 28 | 8580 | 0.34 | NA | NA | 20 | 7030 | 0.459 | NA | 18 | 24 | 8430 | 0.351 | NA | NA | 9 | 9160 | 0.295 | 17 | 24 | 31 | 8280 | 0.362 | 31 | 40 | 49 | 6920 | 0.467 | NA | NA | NA | NA | NA | 12 | 18 | 25 | 8960 | 0.310 | 67 | NA | NA | 5980 | 0.539 | NA | NA | NA | NA | NA | 8 | 12 | 17 | 8760 | 0.326 | NA | NA | NA | NA | NA |
| 01007 | NA | 9 | 13 | 9400 | 0.276 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 38 | 8990 | 0.308 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 12 | 10020 | 0.229 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| 01014 | 8 | 13 | 19 | 9390 | 0.277 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 25 | 9690 | 0.254 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 34 | 9430 | 0.274 | NA | NA | 10 | 9790 | 0.246 | 17 | 27 | 37 | 8920 | 0.313 | NA | 30 | 41 | 9050 | 0.303 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 13 | 20 | 9480 | 0.270 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 10 | 17 | 10010 | 0.229 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| 01024 | NA | 8 | 12 | 9430 | 0.274 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 8 | 10000 | 0.230 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 11 | 8880 | 0.316 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 11 | 10410 | 0.198 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| 01025 | 5 | 7 | 11 | 9020 | 0.305 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 7 | 9760 | 0.249 | NA | NA | 34 | 8310 | 0.360 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 10 | 8400 | 0.353 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 9 | 9690 | 0.254 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
| 01031 | 17 | 23 | 29 | 7930 | 0.389 | NA | NA | NA | NA | NA | NA | 37 | 44 | 7660 | 0.410 | 23 | 32 | 40 | 8020 | 0.383 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 35 | 47 | 56 | 7310 | 0.437 | NA | NA | NA | NA | NA | 28 | 37 | 47 | 7850 | 0.396 | 30 | 41 | 52 | 7920 | 0.390 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 19 | 26 | 34 | 8290 | 0.361 | NA | NA | NA | 6640 | 0.489 | NA | NA | NA | NA | NA | 13 | 20 | 28 | 8780 | 0.324 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA |
This dataset does not have easily understable column names. The first is CODGEO, the town code. The columns starting with “TP” indicate poverty rates. the number afterwards indicate the rate at which people are considered poor. For example, with 60% which is the commonly used cutoff for poverty, the number means that a certain percentage of people earn only 60% of the median revenue in France. The column “TP60IP19” is the poverty intensity. This is the difference between the poverty threshold and the median revenue of people considered poor. The remaining columns divide those values by age.
Source: Poverty by town
2.2.1.7 Immigration by town
| NIVGEO | CODGEO | LIBGEO | AGE4 | IMMI | SEXE | NB |
|---|---|---|---|---|---|---|
| COM | 01001 | L'Abergement-Clémenciat | 00 | 2 | 1 | 82.648 |
| COM | 01001 | L'Abergement-Clémenciat | 15 | 1 | 1 | 1.019 |
| COM | 01001 | L'Abergement-Clémenciat | 15 | 2 | 1 | 39.016 |
| COM | 01001 | L'Abergement-Clémenciat | 25 | 1 | 1 | 3.965 |
| COM | 01001 | L'Abergement-Clémenciat | 25 | 2 | 1 | 145.677 |
| COM | 01001 | L'Abergement-Clémenciat | 55 | 1 | 1 | 7.300 |
Here we have the number of immigrants by town, age and gender. The first 3 columns are town indicators, code and name. the AGE4 variable indicate the age group. IMMI is the immigration status, 1 for immigrant, and 2 for non immigrants. SEXE is the gender, 1 for men and 2 for women and NB is the total number.
2.2.2 Cleaning
2.2.2.1 Crime
Code
Type_per_year <- Crimes |>
filter(valeur.publiée == "diff") |>
filter(!grepl("^97", CODGEO_2023)) |>
select(CODGEO_2023, annee, classe, faits, tauxpourmille)
Type_per_year <- cbind(Type_per_year[1], Type_per_year)
colnames(Type_per_year) <- c("Departement", "Town", "Year", "Type", "Number", "Rate_per_1k")
Type_per_year <- Type_per_year |>
mutate(Departement = substr(Departement, 1, 2)) |>
mutate(Rate_per_1k = str_replace_all(Rate_per_1k, ",", ".")) |>
mutate(Rate_per_1k = as.numeric(Rate_per_1k)) |>
group_by(Departement, Town, Year, Number, Type) |>
summarize(Rate_per_1k = mean(Rate_per_1k, na.rm = TRUE), ) |> #omit the missing values because some are missing even though they were indicated as published
group_by(Departement, Town, Year, Type, Rate_per_1k) |>
summarize(Number = sum(Number))We use this code to get the data by type and by year for each town. First, we only select values indicated as publishing their data (“diff”). We then only take metropolitan departments, by filtering every row except those starting with 97(the indicator for outer seas department). We then duplicate the town code to create a Department column. we rename the columns for more clarity. we then create our department column by substr() the first 2 digits of the town code. We now need to compute the rate for a thousand. To do this, we first must replace every comma by a point, that is because European notation of numbers uses a comma instead of a point for decimal numbers. we then use group_by() and summarize() to get the mean rate. and use the same functions to get the total number of crimes. This dataframe will be useful for our plots
Code
Crime_2019_town <- Type_per_year |>
filter(Year == 19) |>
group_by(Departement, Town) |>
summarize(Rate_per_1k = mean(Rate_per_1k),
Total_crime = sum(Number)) |>
rename(Town_code = Town, Dep_number = Departement)We then summarize the rate value by town to use in our regression
Code
#Create a dataset where we look at the rate for each type of crime
Crime_type_town <- Type_per_year |>
filter(Year == 19) |>
rename(Town_code = Town, Dep_number = Departement) |>
select(Dep_number, Town_code, Type, Rate_per_1k) |>
pivot_wider(names_from = Type, values_from = Rate_per_1k) |>
select(Dep_number, Town_code, "Cambriolages de logement",
"Coups et blessures volontaires",
"Destructions et dégradations volontaires",
"Usage de stupéfiants",
"Vols sans violence contre des personnes")
Crime_type_town[is.na(Crime_type_town)] <- 0We also create a dataset where we keep the type of crime commited. To do this, we simply do a pivot_wider() and select our types of crime. In this instance we only focus on the 5 most commited crimes. Since some cities do not have every type of crime present, we are left with a substantial amount of NA values. We decide to replace our NA’s with zeroes since an NA value indicates that there was less than 5 instances of a crime.
2.2.2.2 Density
Code
Density_town <- Dense |>
select(c(4, 11, 15, 19)) |>
rename(Town_name = `X4`,
Town_code = `X11`,
Pop_2012 = `X15`,
Size = `X19`) |>
filter(!str_detect(Town_code, "^97")) |>
mutate(Density_2019 = Pop_2012/Size) |>
select(Town_code, Town_name, Density_2019)To do this, we first select the columns we need. In that case, the name of the town, the Town_code, the population and the size. We then rename them to our standard, only select metropolitan departments and compute the density(Population/Size) lastly, we select only our important columns.
2.2.2.3 Population
Code
Pop_by_town <- Pop_2019 |>
group_by(CODGEO) |>
summarise(Total = sum(NB)) |>
rename(Town_code = CODGEO, Total_pop = Total)
view(Pop_by_town)The cleaning for this dataset is very simple, we compute the sum of every age group and gender group to find the total number of inhabitants by town.
2.2.2.4 Population without a diploma
Code
Pop_no_diploma <- Pop_16_no_diploma |>
select(c(2, 3, 6, 8, 10)) |>
slice(-1) |>
rename(Dep_number = `Département\nen géographie courante`,
Town_number = `Commune\nen géographie courante`,
Town_name = `Libellé de commune`,
no_diploma_M = `Aucun diplôme\nHommes\n25 ans ou plus\nRP2019`,
no_diploma_F = `Aucun diplôme\nFemmes\n25 ans ou plus\nRP2019`) |>
group_by(Dep_number, Town_number) |>
summarize(no_diploma = sum(as.numeric(no_diploma_F), as.numeric(no_diploma_M), na.rm = TRUE)) |>
mutate(Town_code = paste0(Dep_number, Town_number)) |>
select(Dep_number, Town_code, no_diploma) |>
filter(!str_detect(Dep_number, "^97")) |>
left_join(Pop_by_town, join_by(Town_code)) |>
na.omit() |>
mutate(No_diploma_rate1k = (no_diploma/Total_pop)*1000)
Paris_tot <- Pop_no_diploma |>
filter(Dep_number == "75") |>
summarize(No_diploma_rate1k = mean(No_diploma_rate1k)) |>
rename(Town_code = Dep_number) |>
mutate(Town_code = "75056")
Pop_no_diploma <- Pop_no_diploma |>
group_by(Town_code, No_diploma_rate1k) |>
select(Town_code, No_diploma_rate1k)
Pop_no_diploma <- rbind(Pop_no_diploma, Paris_tot)To have our final data, we first select our needed columns. we then remove the first row and rename the columns. we then compute the total number of people without a diploma and make sur to remove any NA values. we then select only metropolitan departments, with filter() and str_detect. We then compute a left join with the dataframe Pop_by_town to get the population by town. afterwards, we compute the rate at which people do not have a diploma for a thousand people by dividing the total number of people without any diploma by the population of the town and multiply it by a thousand.
Lastly, we have noticed that this dataset provides us with data from every subdivision in the city of Paris, while this level of precision could be welcome, all of our other datasets counts Paris as a single city. As such, we must aggregate our rate for the city of Paris. To do this, we create a new df called Paris_tot, we then summarize all of it. We add the row at the end of our previous dataframe. It is unordered, but will become ordered when we use the join function later on.
2.2.2.5 2017 election results.
Code
Vote_2017 <- Presidentielle_2017_Resultats_Communes_Tour_2_c |>
select(c(1, 2, 3, 25, 32)) #select ony the values important to us in that case, the department, the towncode and the result
Vote_2017 <- Vote_2017[1:35281, ]
colnames(Vote_2017) <- c("Department", "Dep_name", "Town_num", "Macron", "Lepen")
#Need to add the Corse departement number(2A/2B)
Vote_2017$Department <- as.character(Vote_2017$Department)
for (i in 1:nrow(Vote_2017)) {
if (Vote_2017$Dep_name[i] == "Corse-du-Sud") {
Vote_2017$Department[i] <- "2A"
}
if (Vote_2017$Dep_name[i] == "Haute-Corse") {
Vote_2017$Department[i] <- "2B"
}
}
#Now let's standardize the notation of departments and town codes by adding zeroes before.
for (i in 1:nrow(Vote_2017)) {
if (nchar(Vote_2017$Department[i]) == 1) {
Vote_2017$Department[i] <- paste0("0", Vote_2017$Department[i])
}
if (nchar(Vote_2017$Town_num[i])<=2) {
if (nchar(Vote_2017$Town_num[i]) <= 1) {
Vote_2017$Town_num[i] <- paste0("00", Vote_2017$Town_num[i])
}
else {Vote_2017$Town_num[i] <- paste0("0", Vote_2017$Town_num[i])
}
}
}
#Create the town key
Vote_2017 <- Vote_2017 |>
mutate(Town_code = paste0(Department, Town_num)) |>
mutate(Win_Lepen = ifelse(Lepen>50, 1, 0)) |>
select(Department, Town_code, Macron, Lepen, Win_Lepen)#Create a dummy variable that equals 1 if LePen "won" the town (scored more than 50%)we first select our needed columns. then, we only select metropolitan departments and rename our columns. the next step is to code the Corsica departments numbers. Corsica is split into 2 departments. 2A: “Corse-du-Sud” and 2B: “Haute-Corse”. these 2 departments are found between the department 19 and department 21. They used to be one single department butt were split in 2 in 1975. In our database, only the name can be found. we need the number. we use a for loop to code it into the Departement column. We then standardize our Departments numbers and town codes by adding leading zeroes. This then allows us to create our town_number key. Finally, we code a dummy variable that equals 1 if Marine Lepen had more than 50% of the votes in a town.
2.2.2.6 Poverty
Code
Povr_2019 <- Pov_2019 |>
select(c(1, 4, 6)) |>
na.omit() |>
rename(Town_code = CODGEO,
Povrety_2019 = TP6019,
Intensity_povrety = TP60IP19)Poverty is quite simple we select our columns, rename them and we do not forget to omit values that were blank
2.2.2.7 Immigration
Code
Immig_2019_town <- Immig_by_town |>
select(CODGEO, IMMI, NB) |>
rename(Town_code = CODGEO) |>
filter(IMMI == 1) |>
filter(!str_detect(Town_code, "^97")) |>
group_by(Town_code) |>
summarize(NB = sum(NB)) |>
left_join(Pop_by_town, join_by(Town_code)) |>
mutate(Immig_rate = (NB/Total_pop)*1000 ) |>
na.omit() |>
select(Town_code, Immig_rate)Here we apply the same framework we have applied before. We select our needed columns, rename them. we then filter to only have the total of immigrants. then, we add the population from our previously created population dataset using left_join() we then compute the immigration rate, omit missing values.
2.2.2.8 Join them into a single dataframe
Finally, we join all of those datasets into a single one. To do this, we first use full_join() to see how values are missing.
Code
Everything_by_town <- Crime_2019_town |>
full_join(Pop_by_town, join_by(Town_code)) |>
full_join(Density_town, join_by(Town_code)) |>
full_join(Vote_2017, join_by(Town_code)) |>
full_join(Povr_2019, join_by(Town_code)) |>
full_join(Pop_no_diploma, join_by(Town_code)) |>
full_join(Immig_2019_town, join_by(Town_code)) |>
left_join(Unemp, join_by(Dep_number))
summary(Everything_by_town)As we can see, Poverty is a smaller dataset. This may be a problem, but upon further inspection, we see that the smallest population is 1’700 inhabitants. And we still have around 5’000 observations. We previously thought about removing every observation where the number of inhabitants of a town was smaller than 200, because of the poor statistical analysis that would stem from such extreme values. the INSEE institute were most of our data has been collected recommends not doing any analysis on towns of less than 200 inhabitants. As such, the reduced number of observations with the poverty dataset is not a problem. We also add our unemployment numbers from our by_dep dataset because we were unable to find accurate unemployment numbers by department.
Code
Everything_by_town_clean <- Everything_by_town |>
na.omit() |>
select(Dep_number,
Town_code,
Town_name,
Total_pop, Rate_per_1k, Density_2019, Lepen, Win_Lepen, Povrety_2019, Intensity_povrety, No_diploma_rate1k, Immig_rate, Unemp_2019)then, we omit our missing values, select only our very important values and we have our final dataset. that we may use for EDA and most importantly for our Regression analysis. Were we will predict the crime rate of a town using our independent variables.
We then create a dataset for each type of crime per town.
Code
Crime_per_type_town <- Crime_type_town |>
full_join(Everything_by_town_clean, join_by(Town_code)) |>
select(Dep_number.x, Town_code, Town_name,
Total_pop, Total_crime, Rate_per_1k,
"Cambriolages de logement", "Coups et blessures volontaires", "Destructions et dégradations volontaires",
"Usage de stupéfiants", "Vols sans violence contre des personnes",
Density_2019, Lepen, Win_Lepen,
Povrety_2019, Intensity_povrety, No_diploma_rate1k,
Immig_rate, Unemp_2019) |>
na.omit() |>
rename(Assault = `Coups et blessures volontaires`,
Burglary = `Cambriolages de logement`,
Damage = `Destructions et dégradations volontaires`,
Drugs = `Usage de stupéfiants`,
Theft = `Vols sans violence contre des personnes`,
Dep_number = Dep_number.x
) |>
select(-Year) |>
mutate(Lepen= Lepen/100,
Povrety_2019 = Povrety_2019/100,
No_diploma_rate1k = No_diploma_rate1k/1000,
Immig_rate = Immig_rate/1000,
Unemp_2019 = Unemp_2019/100)2.3 Final results
We finally have these datasets.
The Department level:
The town level:
| Dep_number | Town_code | Rate_per_1k | Total_crime | Total_pop | Town_name | Density_2019 | Department | Macron | Lepen | Win_Lepen | Povrety_2019 | Intensity_povrety | No_diploma_rate1k | Immig_rate | Dep_name | Unemp_2019 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01 | 01001 | 0.000 | 0 | 779 | L'ABERGEMENT-CLEMENCIAT | 49.15 | 01 | 60.04 | 39.96 | 0 | NA | NA | 73.93 | 26.30 | Ain | 6.075 |
| 01 | 01002 | 0.000 | 0 | 256 | L'ABERGEMENT-DE-VAREY | 24.15 | 01 | 62.00 | 38.00 | 0 | NA | NA | 21.61 | 12.34 | Ain | 6.075 |
| 01 | 01004 | 4.659 | 722 | 14134 | AMBERIEU-EN-BUGEY | 562.40 | 01 | 58.89 | 41.11 | 0 | 28 | 0.385 | 106.41 | 111.73 | Ain | 6.075 |
| 01 | 01005 | 3.736 | 12 | 1751 | AMBERIEUX-EN-DOMBES | 101.51 | 01 | 52.09 | 47.91 | 0 | NA | NA | 72.95 | 40.97 | Ain | 6.075 |
| 01 | 01006 | 0.000 | 0 | 112 | AMBLEON | 19.73 | 01 | 50.85 | 49.15 | 0 | NA | NA | 45.45 | 72.73 | Ain | 6.075 |
| 01 | 01007 | 5.021 | 74 | 2800 | AMBRONAY | 70.40 | 01 | 56.29 | 43.71 | 0 | 13 | 0.276 | 42.86 | 36.07 | Ain | 6.075 |
As we can see, this dataset includes many towns of very few inhabitants. And many Missing values have been introduced into this dataset. As such, we will omit those missing values that were mostly located in the Poverty department. We are left with this final clean dataset
The smallest town in this dataset is 1711. This will be a much better dataset to run our regressions on.
3 EDA
3.1 Visualization
3.1.1 Crime
3.1.1.1 Evolution of the total crime rate
In this graphic we chose to compute the crime rate per hundred persons in the 5 departments with the biggest and the 5 with the lowest crime rate.
In this graphic we chose to compute the crime rate per hundred persons in the 5 departments with the highest crime rate and those with the lowest crime rate. The first thing we can observe is the major disparity between the average crime number in those departments. In fact the highest crime rate in the Paris is almost 5 times higher than the Aveyron 30.85 vs 145.25 It’s important to note that the departments with the higher crime rate are the one encompassing the biggest cities in France: Paris and it’s suburbs of “Seine-Saint-Denis” and “Hauts-de-Seine”, Marseille is in the “Bouches-du_Rhône”, Lyon is located in the “Rhône”. We can start to understand that density may be a key factor in determining crime rate.
Evolution of Crime in the 5 most criminal departments
Here we can observe that the total crimes in the most criminal departments are lower than what they were twenty years ago in the city of Paris (75) and Marseille (13) but are sensibly the same or higher in the peripheral areas of Paris (92/93) and in the city of Lyon (69). This can possibly be explained by the huge increase of density of population in the peripheral areas of big cities in France.
This graph represents the annual amount of crimes in the 5 least criminal departments. Here we can assume that the number of crimes in the safest departments is quite stable through the years. We also observe the small variation due to COVID-19 Pandemic
3.1.1.2 Types of crime commited
We start by looking at the evolution of the most common types of crime by year in France.
Here we can see that the most common crime in France is deliberate destruction and damage. The effect of the COVID-19 pandemic can also be observed as most crimes have seen an overall drop in 2020. For example, theft without violence saw a sharp drop and its levels have not gone back to 2019 levels of theft.
The Types of crime committed also depend on the location.
In these graphs, we can see the difference in the types of crime committed in the Paris department and in the Cher department. Cher is sparsely populated department. Here we observe that in Paris, theft without violence is the number one committed crime. This is surely due to Paris being a big city with many tourists and as such many pickpockets. This may not be the case in the Cher region with only 300’000 inhabitants. In the Cher, theft is less prevalent with the biggest Crime rate being deliberate destruction and damage.
3.1.2 Unemployment
3.1.2.1 Evolution of the unemployment rate
Let’s now take a look at our first variable: Unemployment. Unemployment in France has evolved through time and has usually been higher than unemployment in Switzerland.
We can observe a huge increase in the unemployment rate in the 90’s in France. As we know after the Second World War, European countries had an important period of economic growth, this period is known as the « 30 glorieuses ». This period matched with an important demand of workers, from Europe and from other continents. There was a need to rebuild the country and an important diversification of the jobs with help of technological progress. In the early 80’s, an important recession started in Europe, with a decrease of job creation in most of the important European countries, especially Germany, Great Britain and France. The the unemployment rate was approximately increasing by 1% a year in France in 1990, to reach the record rate of 11% in 1997. This is mainly explained by the two oil shocks in the 70’s and 80’s. The French president at the time, Francois Mitterand said in 1993 : « In terms of unemployment, we’ve tried everything and we’ve failed everything ». Following this recession France had an important decrease of unemployment rate, the explanations are multiple but the most plausible is the creation of the Euro zone and currency. This tendency lasted until the major financial crisis of 2008. In the last decade the unemployment rate has been mastered once again until the COVID 19 pandemic in 2020 which created a massive suppression of jobs in various areas such as the show industry or the food service sector.
3.1.2.2 Unemployment by department
We can observe with this graph that there’s an important disparity in unemployment rates in France. It’s also relevant that one of the departments with the higher crime rate which we’ve seen before, is also one of the departments with the higher crime rates (93). This may be an indication of what we could possibly find with the statistical regressions.
3.1.3 Immigration
We can use a Treemap to visualize the total number of immigrants by Region. Regions in France contain several Departments. ::: {.cell} ::: {.cell-output-display} ::: :::
As we can see, most immigrants are in the Paris area. These visualizations really put into perspective how big the Paris region is compared to the rest of the country.
3.2 Mapping
This map compute the crime rate in every French department and shows us the important disparity in the amount of crimes committed across the country. As we can see, Paris has a very high crime rate compared to the rest of the departments.
This map computes the unemployment percentage in every department, Unemployement is higher in the north and south as well as around Paris.
This map shows us the score in the second round of the French presidential election of 2017, which we choose to compute as a possible explanation variable. As we know Emmanuel Macron has been elected president with 66.10% so the departments overall voted for him, however it is interesting to see the departments where Marine LePen had an important share of the votes. Here those departments are represented with a darker color.
This map shows the school pass rate for each department at the end of middle school. Unfortunately, no clear trend seems to emerge from the map. Futhermore, the difference between the highest value (0.9305) and the lowest value (0.8274) is only of 10%. This means that all other values are contained within only 10%. Showing them on a map with colors tends to over-exagerate the differences between values. As such, this map has poor vizualization power.
Lastly, this map shows us the immigration rate, by department. As we can see, Immigrants are mostly located in the vicinity of the Paris region. Other notable areas of higher immigration are border regions in the south and in the east.
Here the density can be observed, with Paris and it’s region being an outlier, it is hard to see the difference between other regions. This maps shows clearly that most people in France live around Paris.